Skip to content

Conversation

@bchalios
Copy link
Contributor

@bchalios bchalios commented Apr 2, 2025

Testing CI integration for feature/pcie

roypat and others added 30 commits February 13, 2025 09:43
The jailer wishes to compute the CPU time it spents before exec-ing into
Firecracker, so that it can be passed to Firecracker's `parent-cpu-us`
flag and emitted as a metric.

The CLOCK_PROCESS_CPUTIME_ID is reset on every fork() call, so we need
to stitch together the different values from before the first fork, and
then between each consecutive fork. However, we also need to account for
the scenarios that we do not fork() at all (no daemonization).

Now, this all goes slightly wrong because we want to exclude the time
spent the call to Env::run() from the reported time (note that inside
the jailer, this is only the time spent parsing arguments, but if a user
ended up exec()-ing into the jailer, it would also include the time from
prior to calling exec(), which makes sense to exclude). So we grab the
value of CLOCK_PROCESS_CPUTIME_ID right after parsing arguments, and
subtract it at the end. Sadly, in the case of daemonization, we subtract
it not from the total amount of time spent in the jailer, but rather
from the time spent since the final fork(). This works out if the time
spent since the last fork was greater than the time spent parsing
arguments, but if not, we have an integer underflow on our hands. Now,
this is only a problem in debug builds, and in release builds, we
underflow, and then _over_flow again during the next addition of the
underflowed value to jailer_cpu_time_us, and Firecracker even seens a
sane value for parent_time_cpu_us. But in debug builds, we panic.

Fix this by breaking the subtraction and addition into two separate
statements, to avoid relying on the double over-/underflow

Closes firecracker-microvm#5033

Fixes: 68ab56e
Signed-off-by: Patrick Roy <[email protected]>
In `report_cpu_start_time()`, Firecracker computers the CPU time it took
until the API socket was available as

CLOCK_PROCESS_CPUTIME_ID - cpu_start_time + jailer_time

However, here `cpu_start_time` is always zero currently (it's the value
the jailer passes via `--cpu-start-time`), and in the case where the
jailer exec()s into Firecracker (without prior fork() or clone()), this
is wrong: CLOCK_PROCESS_CPUTIME_ID will now be the CPU time since the
_jailer_ process started, and thus adding the jailer_time to it is
nonsense. This all works out if we set cpu_start_time to be the CPU time
right before we exec() into Firecracker: If we did not fork(), then it
will cause the calculation above to correctly discard all time before
exec()-ing into Firecracker (before re-adding specifically the time
spent in the jailer). If we _did_ fork(), then it will be approximately
zero, and thus preserve the current semantics (as the jailer time will
only be the time _until_ we did the last fork(), and
CLOCK_PROCESS_CPUTIME_ID will only be the absolute CPU time _since_ the
last fork()).

Signed-off-by: Patrick Roy <[email protected]>
The components that go into the calculation are
1. Exclude thing before calling Env::new() (including potential time
   spent in a parent of the jailer itself that ended up exec()-ing
2. The time up until a potential fork() for daemonization
3. The time spent in each temporary process during triple-fork()ing

Signed-off-by: Patrick Roy <[email protected]>
Right now the test would pass on stock AL kernels, because without the
commits from 6.4 [1]  backported the CNTPCT_EL0 register won't even be
included in the snapshot, and thus the assert is never reached.

Fix this by failing the test if the register isnt found.

[1]: https://lore.kernel.org/all/[email protected]/

Fixes: 525e686
Signed-off-by: Patrick Roy <[email protected]>
If we set deflate_on_oom to False, and then induce an out of memory
condition, then we're really only testing how well the guest can handle
oom conditions. Do not fail the test if the guest responds to OOM by
shutting down though.

Signed-off-by: Patrick Roy <[email protected]>
This syscall is issued transparently by the linux kernel when
timing-related syscalls (such as nanosleep) get interrupted, for example
because of SIGSTOP.

Signed-off-by: Patrick Roy <[email protected]>
This syscall is inserted at runtime by the linux kernel, and thus not
actually present in our binary. The static analysis tool thus correctly
marks it as unused. Introduce an allowlist of syscalls that are ignored
by the static analysis tool to deal with this.

Signed-off-by: Patrick Roy <[email protected]>
The access to the `firecracker_pid` property causes a read to the vm's
pid file. This means that in the scenario where the VM is already dead
by the time the monitor starts running, we won't get a `NoSuchProcess`
error, we potentially get a `FileNotFoundError` (if the jail is already
cleaned up). Thus, to avoid spurious failures, also catch
`FileNotFoundError`.

We leave the catch for `NoSuchProcess` error in case a VM exits between
reading the pid file and the call to the`psutils.Process` constructor
(race condition).

Signed-off-by: Patrick Roy <[email protected]>
Running the sandbox command leaves the .ipython folder as owned by root.
Fix this up after exiting so that its can be accesses un-privileged
after docker exits (I have a git pre-commit hook that tries to stash all
unstaged changes, and it always chokes on this folder).

Signed-off-by: Patrick Roy <[email protected]>
Because we're storing dmesg in a local variable, if an exception occurs
while killing a microvm, python will end up printing the entirety of the
host dmesg output to stderr, because it dumps local variables as part of
the backtrace. This is fine in CI because instances are shortlived, but
when running tests locally, it takes my terminal multiple minutes to
catch up with the amount of text that ends up being printed.

Fix this by not storing dmesg in a local variable anymore, and instead
inline it to its only use-site.

Signed-off-by: Patrick Roy <[email protected]>
Bumps the firecracker group with 6 updates:

| Package | From | To |
| --- | --- | --- |
| [zerocopy](https://github.com/google/zerocopy) | `0.8.17` | `0.8.18` |
| [clap](https://github.com/clap-rs/clap) | `4.5.28` | `4.5.29` |
| [cc](https://github.com/rust-lang/cc-rs) | `1.2.13` | `1.2.14` |
| [clap_builder](https://github.com/clap-rs/clap) | `4.5.27` | `4.5.29` |
| [equivalent](https://github.com/indexmap-rs/equivalent) | `1.0.1` | `1.0.2` |
| [toml_edit](https://github.com/toml-rs/toml) | `0.22.23` | `0.22.24` |


Updates `zerocopy` from 0.8.17 to 0.8.18
- [Release notes](https://github.com/google/zerocopy/releases)
- [Changelog](https://github.com/google/zerocopy/blob/main/CHANGELOG.md)
- [Commits](google/zerocopy@v0.8.17...v0.8.18)

Updates `clap` from 4.5.28 to 4.5.29
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](clap-rs/clap@clap_complete-v4.5.28...clap_complete-v4.5.29)

Updates `cc` from 1.2.13 to 1.2.14
- [Release notes](https://github.com/rust-lang/cc-rs/releases)
- [Changelog](https://github.com/rust-lang/cc-rs/blob/main/CHANGELOG.md)
- [Commits](rust-lang/cc-rs@cc-v1.2.13...cc-v1.2.14)

Updates `clap_builder` from 4.5.27 to 4.5.29
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](clap-rs/clap@v4.5.27...v4.5.29)

Updates `equivalent` from 1.0.1 to 1.0.2
- [Commits](indexmap-rs/equivalent@v1.0.1...v1.0.2)

Updates `toml_edit` from 0.22.23 to 0.22.24
- [Commits](toml-rs/toml@v0.22.23...v0.22.24)

---
updated-dependencies:
- dependency-name: zerocopy
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: cc
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: clap_builder
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: equivalent
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: toml_edit
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
...

Signed-off-by: dependabot[bot] <[email protected]>
Since we dropped the explicit offset field from the snapshot file, we
are implicit computing it as "sum of sizes of all preceding regions". If
the snapshot file is corrupted, it can describe regions whose sum
exceeds u64::MAX. Fix this by adding overflow checks and returning an
error in case of overflows

We also error out if it exceeds i64::MAX as the offset argument to
mmap(2) is a signed 64 bit integer value.

Fixes: d835805
Signed-off-by: Patrick Roy <[email protected]>
https://man7.org/linux/man-pages/man7/pid_namespaces.7.html
> A process in an ancestor namespace can send signals to the "init"
> process of a child PID namespace only if the "init" process has
> established a handelr for that signal.

Firecracker (i.e. the "init" process of the new PID namespace) sets up
handlers for some signals including SIGHUP and jailer exits soon after
spawning firecracker into the new PID namespace. If the jailer process
is a session leader and its exit happens after firecracker configures
the signal handlers, SIGHUP will be sent to firecracker and be caught by
the handler unexpectedly.

In order to avoid the above issue, if jailer is a session leader,
creates a new session and makes the child process (i.e. firecracker)
become the leader of the new session to not get SIGHUP on the exit of
jailer.

Note that this is the case only if `--daemonize` is not passed. This is
because we use the double fork method to daemonize, making itself not a
session leader.

Signed-off-by: Takahiro Itazuri <[email protected]>
Follow a similar approach we already take for the vcpu module, where all
the architecture specific stuff is inside aarch64/x86_64 modules,
keeping only the common code in the `vm` module.

No functional change intended.

Signed-off-by: Patrick Roy <[email protected]>
Introduce `enum ArchVmError` which contains architecture specific
variants for VM operations (mostly related to snapshot
creation/restore). Do this by adapting exiting architecture specific
enums that were used for snapshot restoration (x86_64)/GIC creation
(aarch64).

While we're at it, remove unused/duplicated enum variants, and on x86
use the SetIrqChip enum variant for failures inside `setup_irq_chip`
instead of producing some generic `VmError`.

Admittedly, there's a lot more cleanup that can be done here, but for
now this is enough to attain my goals (separating architecture specific
code).

No functional change intended (apart from error messages changing).

Signed-off-by: Patrick Roy <[email protected]>
Similarly to what we do in the vcpu module for the Vcpu/KvmVcpu
structs. The difference is that we do not implement any function on
`ArchVm`, so that common fields like `fd` can live in `Vm`, instead of
needing to be duplicated into all the `ArchVm` types.

Signed-off-by: Patrick Roy <[email protected]>
This allows us to not need to feed `struct Kvm` all the way down to Vcpu
creation.

While we're at it, abstract away the `MsrList` type, and just have a
helper that returns a slice of u32 (as every call site converts the
MsrList to this anyway).

Signed-off-by: Patrick Roy <[email protected]>
This abstracts away the weird irqchip setup dance that builder.rs has
been doing (where depending on architecture, irqchip needs to be created
before or after vcpu creation. Removes a whole bunch of code that
builder.rs that didn't really have too much business being there

Signed-off-by: Patrick Roy <[email protected]>
Instead of doing the irqchip dance inside unit testing code, just call
vm.create_vcpus, which does it for us.

Signed-off-by: Patrick Roy <[email protected]>
We have some utility functions for creating dummy VMs/etc. in unittests,
so let's try to use them instead of copy-pasting the same few lines
around.

Signed-off-by: Patrick Roy <[email protected]>
There's no need to use the GuestMemoryMmap to resolve the pointer to the
start of the GuestMemoryRegion by resolving the guest physical address
it starts at - because we literally store that pointer inside the
GuestMemoryRegion, so can just use that.

Signed-off-by: Patrick Roy <[email protected]>
Since we issue this ioctl with a constant address, it doesn't really
matter _when_ we do it. Internally, it causes a hidden memslot of size 3
pages to be created. If we create it _after_ all other memslots, then
this ioctl can fail with EEXIST if it would overlap some pre-existing
memslot. Now, instead the creation of the overlapping memslot would fail
with EEXIST. But this is a moot point, because if Firecracker ever tried
to create a memslot that overlaps this one, something is fundamentally
broken.

Signed-off-by: Patrick Roy <[email protected]>
All of these were dead code (but the compiler isn't able to tell us
because the vmm crate is a library crate where everything is `pub`, and
it doesn't know that no downstream users outside of the `firecracker`
crate exist, and that everything that's not used in `firecracker` is
dead code).

Signed-off-by: Patrick Roy <[email protected]>
Dropping a GuestMemoryMmap causes the memory to get unmapped, so if we
try to KVM_RUN after dropping it, we'll just get EFAULT.

Admittedly no idea how we didn't run into this issue before.

Signed-off-by: Patrick Roy <[email protected]>
Removed the extended function entry normalisation step from AMD CPUs.
We will now pass through the value provided by KVM or set through a CPU
template. This now mirrors the behaviour of Intel. On newer AMD chips
and newer kernels we are now able to pass through features which were
previously out of range.

Signed-off-by: Jack Thomson <[email protected]>
Add entry for the removal of extended function entry normalisation.

Signed-off-by: Jack Thomson <[email protected]>
The error variant is used when creating a new VM from KVM FD via
KVM_CREATE_VM call. On success, VM FD is returned. The previous error
message doesn't tell a lie but is a bit unintuitive.

Signed-off-by: Takahiro Itazuri <[email protected]>
It is known that KVM_CREATE_VM fails with EINTR on heavily loaded
machines with many VMs. It might be a kernel bug but apparently has not
been fixed. To mitigate it, QEMU does an infinitely retry on EINTR.
Similar, do retries up to 5 times.

Signed-off-by: Takahiro Itazuri <[email protected]>
No reason not to have it in CHANGELOG.

Signed-off-by: Takahiro Itazuri <[email protected]>
Bumps the firecracker group with 18 updates:

| Package | From | To |
| --- | --- | --- |
| [zerocopy](https://github.com/google/zerocopy) | `0.8.18` | `0.8.20` |
| [clap](https://github.com/clap-rs/clap) | `4.5.29` | `4.5.31` |
| [uuid](https://github.com/uuid-rs/uuid) | `1.13.1` | `1.14.0` |
| [libc](https://github.com/rust-lang/libc) | `0.2.169` | `0.2.170` |
| [serde](https://github.com/serde-rs/serde) | `1.0.217` | `1.0.218` |
| [serde_json](https://github.com/serde-rs/json) | `1.0.138` | `1.0.139` |
| [serde_derive](https://github.com/serde-rs/serde) | `1.0.217` | `1.0.218` |
| [log](https://github.com/rust-lang/log) | `0.4.25` | `0.4.26` |
| [aws-lc-rs](https://github.com/aws/aws-lc-rs) | `1.12.2` | `1.12.4` |
| [aws-lc-sys](https://github.com/aws/aws-lc-rs) | `0.25.1` | `0.26.0` |
| [cc](https://github.com/rust-lang/cc-rs) | `1.2.14` | `1.2.15` |
| [clap_builder](https://github.com/clap-rs/clap) | `4.5.29` | `4.5.31` |
| [either](https://github.com/rayon-rs/either) | `1.13.0` | `1.14.0` |
| [inout](https://github.com/RustCrypto/utils) | `0.1.3` | `0.1.4` |
| [typenum](https://github.com/paholg/typenum) | `1.17.0` | `1.18.0` |
| [unicode-ident](https://github.com/dtolnay/unicode-ident) | `1.0.16` | `1.0.17` |
| [uuid-macro-internal](https://github.com/uuid-rs/uuid) | `1.13.1` | `1.14.0` |
| [winnow](https://github.com/winnow-rs/winnow) | `0.7.2` | `0.7.3` |


Updates `zerocopy` from 0.8.18 to 0.8.20
- [Release notes](https://github.com/google/zerocopy/releases)
- [Changelog](https://github.com/google/zerocopy/blob/main/CHANGELOG.md)
- [Commits](google/zerocopy@v0.8.18...v0.8.20)

Updates `clap` from 4.5.29 to 4.5.31
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](clap-rs/clap@clap_complete-v4.5.29...v4.5.31)

Updates `uuid` from 1.13.1 to 1.14.0
- [Release notes](https://github.com/uuid-rs/uuid/releases)
- [Commits](uuid-rs/uuid@1.13.1...v1.14.0)

Updates `libc` from 0.2.169 to 0.2.170
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Changelog](https://github.com/rust-lang/libc/blob/0.2.170/CHANGELOG.md)
- [Commits](rust-lang/libc@0.2.169...0.2.170)

Updates `serde` from 1.0.217 to 1.0.218
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](serde-rs/serde@v1.0.217...v1.0.218)

Updates `serde_json` from 1.0.138 to 1.0.139
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](serde-rs/json@v1.0.138...v1.0.139)

Updates `serde_derive` from 1.0.217 to 1.0.218
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](serde-rs/serde@v1.0.217...v1.0.218)

Updates `log` from 0.4.25 to 0.4.26
- [Release notes](https://github.com/rust-lang/log/releases)
- [Changelog](https://github.com/rust-lang/log/blob/master/CHANGELOG.md)
- [Commits](rust-lang/log@0.4.25...0.4.26)

Updates `aws-lc-rs` from 1.12.2 to 1.12.4
- [Release notes](https://github.com/aws/aws-lc-rs/releases)
- [Commits](aws/aws-lc-rs@v1.12.2...v1.12.4)

Updates `aws-lc-sys` from 0.25.1 to 0.26.0
- [Release notes](https://github.com/aws/aws-lc-rs/releases)
- [Commits](aws/aws-lc-rs@aws-lc-sys/v0.25.1...aws-lc-sys/v0.26.0)

Updates `cc` from 1.2.14 to 1.2.15
- [Release notes](https://github.com/rust-lang/cc-rs/releases)
- [Changelog](https://github.com/rust-lang/cc-rs/blob/main/CHANGELOG.md)
- [Commits](rust-lang/cc-rs@cc-v1.2.14...cc-v1.2.15)

Updates `clap_builder` from 4.5.29 to 4.5.31
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](clap-rs/clap@v4.5.29...v4.5.31)

Updates `either` from 1.13.0 to 1.14.0
- [Commits](rayon-rs/either@1.13.0...1.14.0)

Updates `inout` from 0.1.3 to 0.1.4
- [Commits](RustCrypto/utils@inout-v0.1.3...inout-v0.1.4)

Updates `typenum` from 1.17.0 to 1.18.0
- [Release notes](https://github.com/paholg/typenum/releases)
- [Changelog](https://github.com/paholg/typenum/blob/main/CHANGELOG.md)
- [Commits](paholg/typenum@v1.17.0...v1.18.0)

Updates `unicode-ident` from 1.0.16 to 1.0.17
- [Release notes](https://github.com/dtolnay/unicode-ident/releases)
- [Commits](dtolnay/unicode-ident@1.0.16...1.0.17)

Updates `uuid-macro-internal` from 1.13.1 to 1.14.0
- [Release notes](https://github.com/uuid-rs/uuid/releases)
- [Commits](uuid-rs/uuid@1.13.1...v1.14.0)

Updates `winnow` from 0.7.2 to 0.7.3
- [Changelog](https://github.com/winnow-rs/winnow/blob/main/CHANGELOG.md)
- [Commits](winnow-rs/winnow@v0.7.2...v0.7.3)

---
updated-dependencies:
- dependency-name: zerocopy
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: uuid
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: firecracker
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: serde_derive
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: log
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: aws-lc-rs
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: aws-lc-sys
  dependency-type: indirect
  update-type: version-update:semver-minor
  dependency-group: firecracker
- dependency-name: cc
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: clap_builder
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: either
  dependency-type: indirect
  update-type: version-update:semver-minor
  dependency-group: firecracker
- dependency-name: inout
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: typenum
  dependency-type: indirect
  update-type: version-update:semver-minor
  dependency-group: firecracker
- dependency-name: unicode-ident
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: uuid-macro-internal
  dependency-type: indirect
  update-type: version-update:semver-minor
  dependency-group: firecracker
- dependency-name: winnow
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
...

Signed-off-by: dependabot[bot] <[email protected]>
roypat and others added 29 commits March 27, 2025 10:25
Almost all the free functions in arch::arch64::vcpu were just taking a
vcpu reference as a first parameter (in some form). So just turn these
into methods implemented on KvmVcpu that take &self.

Signed-off-by: Patrick Roy <[email protected]>
Remove some more .map_err() noise in the mmio device registration code
in builder.rs by using #[from] and returning more specialized error
types.

Signed-off-by: Patrick Roy <[email protected]>
This is to keep free space in the /srv filesystem.

Signed-off-by: Nikita Kalyazin <[email protected]>
If UMWEAIT/TPAUSE instructions are executed within a guest, a physical
processor is put into an implementation-depedent optimized state. It may
lead to an overhead of waking it up when scheduling another guest on it.
See comment in code for more details.

Signed-off-by: Takahiro Itazuri <[email protected]>
We disabled WAITPKG intentionally since it can put a physical processor
into an implementation-depedent optimized state. So it should not exist
in guest.

Signed-off-by: Takahiro Itazuri <[email protected]>
We prevent Rust from closing the socket file descriptor to avoid a
potential race condition between the mappings message and the connection
shutdown.  If the latter arrives at the UFFD handler first, the handler
never sees the mappings.

Signed-off-by: Nikita Kalyazin <[email protected]>
The restore_from_snapshot function did not integrate well with
uffd-based snapshot restore: Even if a UFFD path was specified, it still
created a copy of the snapshot memory file inside the chroot, even
though the UFFD handler set this up long ago in space_pf_handler.

Fix this, and while we're at it, also remove the need for passing in
uffd handler and snapshot file explicitly when using uffd-based restore,
as the spawn_pf_handler sets the uffd_handler field of the microvm
object, and can also easily be made to actually contain the snapshot
from which page faults are being served.

Signed-off-by: Patrick Roy <[email protected]>
If build_n_from_snapshot is asked to build many snapshots incrementally,
it doesn't clean up after itself properly: Each iteration ends by
creating a new snapshot of the VM, and this snapshot is passed to
iteration n+1. Here, a copy is created inside the new VMs chroot, and
iteration n+1 ends by creating a new snapshot for iteration n+2, and
deleting the copy of the snapshot inside the chroot. However, we never
delete the snapshot created in iteration n, and so with each iteration
more snapshots accumulate.

Fix this by having the function delete the snapshot created in iteration
n after iteration n+2 finished successfully.  The idea here is that in
case of failure, we will have the snapshot created in iteration n+1 (the
one which caused a failure in n+2), and also the snapshot created in n
(which was the last known snapshot to successfully go through a test
iteration, namely iteration n+1).

Signed-off-by: Patrick Roy <[email protected]>
It was bothering me that there was an empty line.

Signed-off-by: Patrick Roy <[email protected]>
Follow the official migration guide for upgrading with serde support,
described in [1]

[1]: https://github.com/bincode-org/bincode/blob/trunk/docs/migration_guide.md#migrating-with-serde

Signed-off-by: Patrick Roy <[email protected]>
…ates

Bumps the firecracker group with 6 updates in the / directory:

| Package | From | To |
| --- | --- | --- |
| [clap](https://github.com/clap-rs/clap) | `4.5.32` | `4.5.35` |
| [micro_http](https://github.com/firecracker-microvm/micro-http) | ``18f5b6f`` | ``e854e50`` |
| [env_logger](https://github.com/rust-cli/env_logger) | `0.11.7` | `0.11.8` |
| [aws-lc-rs](https://github.com/aws/aws-lc-rs) | `1.12.6` | `1.13.0` |
| [aws-lc-fips-sys](https://github.com/aws/aws-lc-rs) | `0.13.4` | `0.13.5` |
| [once_cell](https://github.com/matklad/once_cell) | `1.21.1` | `1.21.3` |



Updates `clap` from 4.5.32 to 4.5.35
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](clap-rs/clap@clap_complete-v4.5.32...clap_complete-v4.5.35)

Updates `micro_http` from `18f5b6f` to `e854e50`
- [Commits](firecracker-microvm/micro-http@18f5b6f...e854e50)

Updates `env_logger` from 0.11.7 to 0.11.8
- [Release notes](https://github.com/rust-cli/env_logger/releases)
- [Changelog](https://github.com/rust-cli/env_logger/blob/main/CHANGELOG.md)
- [Commits](rust-cli/env_logger@v0.11.7...v0.11.8)

Updates `aws-lc-rs` from 1.12.6 to 1.13.0
- [Release notes](https://github.com/aws/aws-lc-rs/releases)
- [Commits](aws/aws-lc-rs@v1.12.6...v1.13.0)

Updates `aws-lc-fips-sys` from 0.13.4 to 0.13.5
- [Release notes](https://github.com/aws/aws-lc-rs/releases)
- [Commits](aws/aws-lc-rs@aws-lc-fips-sys/v0.13.4...aws-lc-fips-sys/v0.13.5)

Updates `aws-lc-sys` from 0.27.1 to 0.28.0
- [Release notes](https://github.com/aws/aws-lc-rs/releases)
- [Commits](aws/aws-lc-rs@aws-lc-sys/v0.27.1...aws-lc-sys/v0.28.0)

Updates `clap_builder` from 4.5.32 to 4.5.35
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](clap-rs/clap@v4.5.32...v4.5.35)

Updates `once_cell` from 1.21.1 to 1.21.3
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](matklad/once_cell@v1.21.1...v1.21.3)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: micro_http
  dependency-type: direct:production
  dependency-group: firecracker
- dependency-name: env_logger
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: aws-lc-rs
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: firecracker
- dependency-name: aws-lc-fips-sys
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: aws-lc-sys
  dependency-type: indirect
  update-type: version-update:semver-minor
  dependency-group: firecracker
- dependency-name: clap_builder
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
- dependency-name: once_cell
  dependency-type: indirect
  update-type: version-update:semver-patch
  dependency-group: firecracker
...

Signed-off-by: dependabot[bot] <[email protected]>
The explicit calls to GuestMemoryMmap::create ended up being touched in
like 5 of the following commits, so let's reduce the churn by putting it
into a function.

Signed-off-by: Patrick Roy <[email protected]>
Have all of them operate on some sort of list of (GuestAddress, usize)
tuples, for consistency.

Signed-off-by: Patrick Roy <[email protected]>
This struct will hold all the fields that are common between the x86_64
and aarch64 definition of `struct ArchVm`. For now, this is only the
`VmFd`, but this will grow.

Signed-off-by: Patrick Roy <[email protected]>
This is to prepare for memory regions being registered to a VM
incrementally.

While we're at it, add a unit test for it. It completes in roughly 50s
on my machine, and will become more useful later once we actually allow
building mmap regions incrementally.

Signed-off-by: Patrick Roy <[email protected]>
Semantically, this is where it belong - the memory is associated with
the virtual machine, not the virtual machine manager.

But independently of this, it is needed for borrowchk reasons: In the
future, each Vm will have two guest memory regions associated with it.
One for traditional guest memory, and another that is dedicated to I/O
(swiotlb region).  Storing these two GuestMemoryMmap in `struct Vm`
allows for a function as follows:

fn io_memory(&self) -> &GuestMemoryMmap {
    self.io_memory.as_ref().unwrap_or(&self.normal_memory)
}

Such a function cannot live in `struct Vmm`, as that would mean
borrowing the guest memory implies borrowing the entire `Vmm` object
(because rustc is not intelligent enough to project split borrows past
function calls - e.g. it doesn't know the references returned by
io_memory() really only borrows the io_memory and normal_memory fields,
not everything else.). However, this means it cannot be used in
callsites where such splitting of borrows is required (e.g. everywhere
where both a guest memory reference _and_ another Vmm field are passed
by reference to a function).

The `Option`-ness of the the guest_memory field is transitory.

Signed-off-by: Patrick Roy <[email protected]>
This is in preparation of guest memory itself only getting set up after
the call to create_vmm_and_vcpus().

Signed-off-by: Patrick Roy <[email protected]>
Add `Vm::register_memory_region[s]`, which allow adding new memory
regions to a Vm incrementally (e.g. without overwriting the previously
registered regions).

To use this, have the functions in the `memory` module hand out
Vec<GuestMemoryRegion> instead of GuestMemoryMmap, because vm_memory
does not offer APIs for extracing the regions of a GuestMemoryMmap
again.

Signed-off-by: Patrick Roy <[email protected]>
Since the guest memory is now stored in `struct Vm`, we can more easily
store the guest memory state inside the VmState.

Signed-off-by: Patrick Roy <[email protected]>
Now that guest memory is tracked inside struct Vm instead of struct Vmm,
the dirty bitmap tracking functions can also start living in struct Vm.

While we're touching it anyway, update the types used to kvm memslot
numbers to be the native u32. This saves us a few ugly usize->u32 cast.

Also change the error type as to not drag in VmmError into the vm
module, where it really doesn't belong.

Signed-off-by: Patrick Roy <[email protected]>
Do this in create_snapshot() instead of snapshot_memory_to_file(), as it
was the only operation inside snapshot_memory_to_file() that required
access to a struct Vmm, opening the gates for move the rest of
snapshot_memory_to_file() into struct Vm.

Signed-off-by: Patrick Roy <[email protected]>
struct Vm encapsulates the memory, so having this function live there
makes sense. It also prepares us for the future where multiple
GuestMemoryMmap objects need to be managed.

Signed-off-by: Patrick Roy <[email protected]>
Avoid some .map_err() calls using #[from] and the auto-From::from
calling of the question mark operator.

Signed-off-by: Patrick Roy <[email protected]>
We can just immediately propagate errors to the caller using `?` syntax,
which saves us the "if (...).is_ok()" check.

Signed-off-by: Patrick Roy <[email protected]>
Otherwise it's impossible to figure out what precisely went wrong during
queue restoration.

Signed-off-by: Patrick Roy <[email protected]>
This allows having regions that start somewhere other than guest
physical address 0. Useful in case not all regions are allocated at
once, and each batch needs to be placed after all already allocated
regions.

Since the resulting code has some intricacies (and I actually got it
wrong the first time around), write a kani harness to validate that
we're computing the regions correctly.

On ARM, add a warning in case we hit some questionable logic around
truncation of memory sizes if they exceed architectural bounds (although
if anyone is hitting this, they're already doing something wrong anyway,
because it'd require a VM with more than 1022GiB of memory).

Signed-off-by: Patrick Roy <[email protected]>
This way the non-snapshot path doesnt have to deal with passing in
`None`.

Signed-off-by: Patrick Roy <[email protected]>
Only enabling it when a balloon device is present doesn't really gain us
anything, because the only time this feature is checked in the kernel is
as part of madvise handling - but without balloon devices we wont ever
call madvise, so it makes no difference whether we enabled the feature
or not (it is unconditionally available since linux 4.10).

Signed-off-by: Patrick Roy <[email protected]>
Dropping the VM object closes the VM FD, which can cause the test to
hang.

Signed-off-by: Patrick Roy <[email protected]>
@bchalios bchalios merged commit 45fc21b into firecracker-microvm:feature/pcie Apr 2, 2025
5 of 9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.